Using methods of statistical physics, we study the average number and kernel size of general sparse random matrices over GF(q), with a given connectivity profile, in the thermodynamic limit of large matrices. We introduce a mapping of GF(q) matrices onto spin systems using the representation of the cyclic group of order q as the q-th complex roots of unity. This representation facilitates the derivation of the average kernel size of random matrices using the replica approach, under the replica symmetric ansatz, resulting in saddle point equations for general connectivity distributions. Numerical solutions are then obtained for particular cases by population dynamics. Similar techniques also allow us to obtain an expression for the exact and average number of random matrices for any general connectivity profile. We present numerical results for particular distributions.

I. INTRODUCTION

Random matrices over GF(q) are important in a number of application areas, ranging from biology to computer science and telecommunication. One area where they play a particularly important role is coding theory. In particular, linear codes are defined by the kernel of a parity-check matrix, where each kernel vector is termed a codeword and is associated with an original uncoded message vector by a linear operation defined by a generator matrix. Well-known examples include the Hadamard codes, where properties of the kernel and rank play an important role [2], and low-density parity-check (LDPC) codes, which provide the best performance to date in many noise regimes. Although the most studied and applied LDPC codes are binary codes over GF(2), there is a significant body of work, of both practical and theoretical nature, on codes over more general finite fields, showing an improvement in performance with respect to the binary version. In particular, statistical physics based analyses of LDPC codes over GF(q) have also been reported.

Low-density parity-check codes are based on random sparse matrices, where the fraction of non-zero elements goes to zero as the size of the matrix increases. In most studies of LDPC codes, it is assumed that a parity-check matrix with $M$ rows (parity checks) and $N$ columns defines a code of rate exactly $R = 1 - M/N$, which is equivalent to the assertion that the number of vectors in the kernel (and therefore the number of codewords) is exactly $q^{NR}$.

In addition to being an interesting applied problem, the properties of these matrices are also of great interest from the pure mathematical point of view, and a number of papers have already addressed related questions in different instances with a mathematically rigorous approach.

In this contribution, we address two key properties of sparse random matrices over GF(q), namely the average dimension of their kernel and the number of matrices for a given connectivity profile, in the case of large matrices. When the matrices are large, taking $N \to \infty$ with $M/N$ constant, the problem can be mapped onto a system of interacting "spins", and the powerful machinery developed for the study of disordered spin lattices in condensed matter physics can then be used, under some assumptions, to obtain the required properties.

In order to keep this paper as self-contained as possible and make it accessible to a broad readership, we provide in section II a brief introduction to GF(q) matrices and their properties, and to the basic statistical physics methodology on which we have based our analysis. The usual statistical physics approach to the analysis of LDPC codes over the binary field GF(2) is generalized in such a way that it can be efficiently applied to any GF(q) for a general connectivity distribution of non-zero elements, and is then used to calculate the average kernel dimension of sparse random matrices (SRM) in section IV. Making use of techniques developed in section IV, the number of matrices for a given distribution of non-zero elements is then obtained for various connectivity profiles in section V. Finally, we present a discussion of the obtained results in section VI.

II. KEY CONCEPTS

A. GF(q) Matrices

A Galois field GF(q) is a finite field with $q$ elements, i.e., a set of $q$ elements $\{0, \ldots, q-1\}$, which we label by integers for convenience. It is a commutative group under an addition $\oplus : GF(q) \times GF(q) \to GF(q)$, defined as integer addition mod $q$, and has a monoid structure with respect to a commutative multiplication $\otimes : GF(q) \times GF(q) \to GF(q)$. The field includes the zero element '0', the additive identity, which leaves every element unchanged under $\oplus$, and the multiplicative identity '1'; an additional requirement is that multiplication distributes over addition. These requirements restrict the number of elements to $q = p^n$, where $p$ is a prime number and $n$ a positive integer.

Entries in matrices over GF(q) take values in the field GF(q), and the additions and multiplications involved in their algebra are defined by the corresponding operations over the Galois field. The kernel, or null space, of an $M \times N$ matrix $A$ is defined as the set of vectors $v \in GF(q)^N$ such that $Av = 0$, with all operations in the field GF(q). The kernel is a linear vector space and therefore has $q^{d(A)}$ vectors, where $d(A)$ is the kernel dimension. The rank $r(A)$ of the matrix is obtained from the rank-nullity theorem as $r(A) = N - d(A)$.
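As a concrete illustration of these definitions, the following Python sketch computes the rank of a small matrix by Gaussian elimination and checks the kernel size against a brute-force enumeration. It is only a sketch under the assumption that $q$ is prime, so that field arithmetic reduces to integer arithmetic mod $q$; the function names `rank_mod_p` and `kernel_size_brute_force` and the example matrix are illustrative choices, not part of the original analysis.

```python
from itertools import product


def rank_mod_p(rows, p):
    """Rank of a matrix over GF(p), p prime, by Gaussian elimination."""
    a = [[x % p for x in row] for row in rows]
    n_cols = len(a[0]) if a else 0
    rank = 0
    for col in range(n_cols):
        # Look for a pivot in this column among the rows not yet used.
        pivot = next((r for r in range(rank, len(a)) if a[r][col]), None)
        if pivot is None:
            continue
        a[rank], a[pivot] = a[pivot], a[rank]
        inv = pow(a[rank][col], -1, p)  # modular inverse (Python 3.8+)
        a[rank] = [(x * inv) % p for x in a[rank]]
        for r in range(len(a)):
            if r != rank and a[r][col]:
                f = a[r][col]
                a[r] = [(x - f * y) % p for x, y in zip(a[r], a[rank])]
        rank += 1
    return rank


def kernel_size_brute_force(rows, p):
    """Count the vectors v in GF(p)^N with A v = 0 by enumeration."""
    n_cols = len(rows[0])
    return sum(
        all(sum(a * x for a, x in zip(row, v)) % p == 0 for row in rows)
        for v in product(range(p), repeat=n_cols)
    )


# Hypothetical example over GF(3); the third row is the sum of the first two.
A = [[1, 2, 0, 1],
     [0, 1, 1, 2],
     [1, 0, 1, 0]]
q = 3
r = rank_mod_p(A, q)
d = len(A[0]) - r  # rank-nullity theorem: d(A) = N - r(A)
```

For this example the enumeration and the rank-nullity theorem can be compared directly through `kernel_size_brute_force(A, q) == q ** d`.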



B. Disordered Systems 



An interacting spin problem has two main elements: an interaction defined between a number of spin units on a lattice, collectively represented by the vector $\sigma = (\sigma_1, \ldots, \sigma_N)$, and a local field which acts on each variable $\sigma_i$ separately. Disordered spin systems are systems where one or both of these elements (interaction and field) are random variables. Usually, we are interested in the properties of very large systems, where the number $N$ of spins becomes infinite, the so-called thermodynamic limit.

The main properties of the system in the thermodynamic limit can be derived from a key quantity, the free energy $f$, which in probabilistic terms corresponds to the cumulant generating function. For disordered systems, in the cases where the free energy is self-averaging with respect to the disorder, we can calculate this quantity as

$$ f = -\lim_{N\to\infty}\frac{1}{\beta N}\langle \ln Z\rangle, \tag{1} $$

where $\langle\cdot\rangle$ indicates the disorder average, $\beta$ is the inverse temperature, $Z = \sum_{\sigma} e^{-\beta\mathcal{H}(\sigma)}$ is the partition function and $\mathcal{H}(\sigma)$ is the Hamiltonian of the system. Although the self-averaging property should be rigorously investigated for each system, we will assume it holds here.

In order to obtain the free energy, a powerful technique is to make use of the replica method, based on the identity

$$ \left.\frac{\partial}{\partial n}\ln\langle Z^n\rangle\right|_{n=0} = \langle\ln Z\rangle. \tag{2} $$



Average quantities can then be calculated for integer $n$ and analytically continued to $n \to 0$. The replica method is commonly used in the area of disordered systems and is known to provide exact results in many regimes, which include both physical and non-physical systems.

Many problems in computing and communication theory can be mapped to spin systems. For instance, error-correcting codes, in particular LDPC codes [10], and hard computational problems such as K-SAT [11] and graph coloring [12, 13] can be mapped to diluted spin systems with random p-spin interactions and local fields. In the coding example, interactions are defined by the parity-check constraints, while the local fields are induced by the codeword and received message. In the statistical physics treatment, for mathematical convenience, the message bits $\{0, 1\}$ and the '$\oplus$' operation are mapped onto spin values $\{+1, -1\}$ and multiplication using the mapping $x \to (-1)^x$. Variables over a general finite field GF(q), $q > 2$, are typically first mapped onto a binary string and then, using the spin representation, transformed into a spin system.



III. MAPPING GF(q) MATRICES INTO SPIN SYSTEMS 

The transformation

$$ \sigma(v) = (-1)^v, \tag{3} $$

where $\sigma \in \{+1, -1\}$ and $v \in \{0, 1\}$, is usually employed to map the GF(2) variables onto the binary spin representation. This mapping can be generalized to any GF(q) without an intermediate use of the binary field.

Under the operation $\oplus$, GF(q) is isomorphic to the cyclic group of order $q$ and therefore has a representation as the complex $q$-th roots of unity, with the group homomorphism $\sigma : GF(q) \to \mathbb{C}$ given by

$$ \sigma(v) = \exp\left(\frac{2\pi i}{q}v\right). \tag{4} $$

This mapping has a clear geometric interpretation: $2\pi v/q$ is an angle on the unit circle, so that each element of the Galois field is mapped onto a spin variable "pointing" in one of $q$ possible directions. Using this mapping allows one to write the null-space constraint for a general vector $v = (v_1, \ldots, v_N) \in GF(q)^N$ as

$$ \delta\Big(\bigoplus_{j=1}^{N}A_{ij}\otimes v_j, 0\Big) = \frac{1}{A(q)}\prod_{m=1}^{q-1}\Big[1 - e^{2\pi i m/q}\prod_{j=1}^{N}\sigma(A_{ij}\otimes v_j)\Big], \qquad A(q) = \prod_{m=1}^{q-1}\big[1 - e^{2\pi i m/q}\big]. $$

Using the properties of the complex roots of unity, the above quantity $A(q)$ can be shown (see appendix A) to be real and equal to the order $q$ of the field.
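The properties of the roots-of-unity mapping can be checked numerically. The sketch below assumes prime $q$ with addition taken mod $q$, verifies that $\oplus$ becomes ordinary complex multiplication, evaluates the product over the non-trivial roots that appendix A calls $A(q)$, and builds a Kronecker delta from the standard character-sum identity $\delta(x, 0) = \frac{1}{q}\sum_{s=0}^{q-1} e^{2\pi i s x/q}$. That identity is a textbook one and is used here only for illustration; it is not necessarily the form of the constraint used in the derivation. The names `sigma`, `A_of_q` and `delta_zero` are hypothetical.

```python
import cmath


def sigma(v, q):
    """Map an element v of Z_q to the q-th root of unity exp(2*pi*i*v/q)."""
    return cmath.exp(2j * cmath.pi * v / q)


def A_of_q(q):
    """Product over the non-trivial q-th roots of unity of (1 - root)."""
    prod = 1 + 0j
    for m in range(1, q):
        prod *= 1 - sigma(m, q)
    return prod


def delta_zero(x, q):
    """Kronecker delta(x, 0) on Z_q written as an average over characters."""
    return sum(sigma(s * x, q) for s in range(q)) / q


q = 5
# Addition mod q becomes multiplication of roots of unity.
homomorphism_ok = all(
    abs(sigma((u + v) % q, q) - sigma(u, q) * sigma(v, q)) < 1e-12
    for u in range(q) for v in range(q)
)
```

Evaluating `A_of_q(k)` for small `k` gives values that agree with `k` up to rounding error, in line with the statement proved in appendix A.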

Based on this representation, we can now define the "magnetization" of the original system, in analogy with the spin system, as

$$ m = \frac{1}{N}\sum_{j=1}^{N}\sigma_j, \tag{9} $$

and the overlap between two configurations $\sigma$ and $\sigma'$ as

$$ \rho = \frac{1}{N}\sum_{j=1}^{N}\sigma_j\sigma'_j, \tag{10} $$

where we are now working with the spin variables already mapped to the complex field $\mathbb{C}$, and therefore the operations of multiplication and addition correspond to the usual ones in $\mathbb{C}$.

It turns out that this kind of representation allows a factorization of terms that simplifies the equations and the replica calculations, as we will see in the following.



IV. AVERAGE PROPERTIES OF THE KERNEL 



The dimension of the kernel of an $M \times N$ matrix $A$ over GF(q) can be written as $d(A) = \log_q\Omega$, where

$$ \Omega = \sum_{v}\delta(Av, 0) \tag{11} $$

is the number of vectors in the kernel, $\delta$ is the Kronecker delta and $v \in GF(q)^N$. Direct calculation of $\Omega$ from equation (11), by straightforwardly substituting the Kronecker delta by its integral representation, trivially reproduces the rank-nullity theorem. This calculation is not presented here.

The quantity we are interested in here is the average kernel dimension or, more specifically, its density in the limit of large matrices, defined as $Ts$, where

$$ s = \frac{1}{T}\lim_{N\to\infty}\frac{1}{N}\langle d(A)\rangle_A = \lim_{N\to\infty}\frac{1}{N}\langle\ln\Omega\rangle_A, \tag{12} $$

where $1/T = \ln q$ and $M/N = \lambda$, with $\lambda$ a finite positive constant. Using the replica identity we can write

$$ s = \lim_{N\to\infty}\frac{1}{N}\left.\frac{\partial}{\partial n}\ln\langle\Omega^n\rangle_A\right|_{n=0}. \tag{13} $$

The randomly chosen sparse matrices $A$ have exactly $K_i$ non-zero elements in the $i$-th row with probability $P(K)$, $K = (K_1, \ldots, K_M)$, and $C_j$ non-zero elements in the $j$-th column with probability $P(C)$, $C = (C_1, \ldots, C_N)$, obeying the constraint $\Lambda = \sum_i K_i = \sum_j C_j$, where $\Lambda$ is the total number of non-zero elements of the matrix. The elements of $A$ are sampled independently from the finite field GF(q) with identical probabilities $P(A_{ij})$.

Let us define, for brevity of notation, $Z_n = \langle\Omega^n\rangle_A$. Although the calculations, presented in appendix B, are similar to related calculations in [3], we will use a different approach which is conceptually clearer and has the advantage of allowing later generalizations. In this approach, we sum directly over all entries of the matrix instead of defining a connectivity tensor as used elsewhere [15].

where the average is over the probability distribution $P(K, C, \Lambda)$, with $\chi(A_{ij}) = 0$ if $A_{ij} = 0$ and 1 otherwise, and the normalization $\mathcal{N}$ gives the number of matrices which obey the constraints, averaged over the distributions of the entries. In this way, any type of constraint on the matrix can be readily included in the calculation. This could be rather cumbersome in approaches based on a connectivity tensor, where the corresponding constraints have to be written in terms of the tensor elements, which can be extremely complicated.

We refer the reader to appendix B for details of the calculations. Using the replica symmetric ansatz, which is shown to be exact for this problem (see appendix D), we arrive at the following self-consistent saddle point equations

It must be noted that the above equations are only meaningful if $\Lambda \propto N$. A striking property of the above equations is that they are completely independent of the specific distribution of the individual elements of the matrix, depending only on the distribution of $K$ and $C$ (and, obviously, of $\Lambda$).

There exist two straightforward analytical solutions of the above equations, namely the paramagnetic one, given by

$$ \pi(x) = \delta(x), \qquad \hat{\pi}(\hat{x}) = \delta(\hat{x}), \tag{20} $$

and the ferromagnetic solution

$$ \pi(x) = \delta(x - 1), \qquad \hat{\pi}(\hat{x}) = \delta(\hat{x} - 1). \tag{21} $$

When substituted into the above equations, the paramagnetic solution gives the average kernel density as $Ts = 1 - \lambda = 1 - M/N$, independently of the order $q$ of the finite field used. In the case of LDPC codes defined by such matrices, this corresponds to random parity-check matrices that define a code of rate $R = 1 - \lambda$. The average rank density in this case is $\lambda$. The ferromagnetic solution gives $Ts = 0$, so that the matrix has full rank $N$; incidentally, this means that such matrices cannot be used to define a parity-check code, since the kernel then contains only the zero vector.
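The paramagnetic prediction $Ts = 1 - \lambda$ can be compared with direct sampling at finite size. The sketch below is not the population dynamics used in this work; it is a simple Monte Carlo check under stated assumptions: $q$ prime, exactly $K$ non-zero entries per row placed uniformly at random with uniformly random non-zero values, and no constraint on the column counts. The helper names and parameter values are hypothetical.

```python
import random


def rank_mod_p(rows, p):
    """Rank of a matrix over GF(p), p prime, by Gaussian elimination."""
    a = [[x % p for x in row] for row in rows]
    n_cols = len(a[0]) if a else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(a)) if a[r][col]), None)
        if pivot is None:
            continue
        a[rank], a[pivot] = a[pivot], a[rank]
        inv = pow(a[rank][col], -1, p)  # modular inverse (Python 3.8+)
        a[rank] = [(x * inv) % p for x in a[rank]]
        for r in range(len(a)):
            if r != rank and a[r][col]:
                f = a[r][col]
                a[r] = [(x - f * y) % p for x, y in zip(a[r], a[rank])]
        rank += 1
    return rank


def random_sparse_matrix(M, N, K, p, rng):
    """M x N matrix with exactly K random non-zero entries in each row."""
    rows = []
    for _ in range(M):
        row = [0] * N
        for j in rng.sample(range(N), K):
            row[j] = rng.randrange(1, p)
        rows.append(row)
    return rows


def mean_kernel_density(M, N, K, p, samples, rng):
    """Sample average of d(A)/N = (N - r(A))/N."""
    total = 0.0
    for _ in range(samples):
        A = random_sparse_matrix(M, N, K, p, rng)
        total += (N - rank_mod_p(A, p)) / N
    return total / samples


rng = random.Random(1)
N, K, p = 40, 4, 3
densities = {M: mean_kernel_density(M, N, K, p, 20, rng) for M in (10, 20, 30)}
```

Since the rank can never exceed $M$, every sampled density is bounded below by $1 - M/N$; how close the averages in `densities` come to that bound gives a rough feeling for the finite-size corrections.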

These quantities can be associated with analogous quantities in the statistical mechanics framework. We start by associating the average rank density with the free energy $f$ and writing

$$ f \equiv \frac{\langle r(A)\rangle_A}{N} = 1 - Ts, \tag{22} $$

which allows one to associate $s$ with the entropy, the internal energy density being constrained to be $u = 1$. Defining $\beta = 1/T$, equation (22) becomes

$$ f = -\frac{1}{\beta N}\left\langle\ln\sum_{v}e^{-\beta H(v)}\right\rangle_A, \tag{23} $$

where the Hamiltonian of the corresponding statistical mechanical system is formally

$$ H(v) = N - \ln\delta(Av, 0). \tag{24} $$
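The identification of the rank density with a free energy can be verified on a single small instance. The sketch below assumes prime $q$ and the formal Hamiltonian $H(v) = N - \ln\delta(Av, 0)$ with $\beta = \ln q$; it evaluates $-\ln Z/(\beta N)$ by brute-force enumeration, without any disorder average, and compares it with $1 - d(A)/N$. The function names and the example matrix are hypothetical.

```python
import math
from itertools import product


def in_kernel(A, v, q):
    """True if A v = 0 with arithmetic mod q."""
    return all(sum(a * x for a, x in zip(row, v)) % q == 0 for row in A)


def free_energy(A, q):
    """f = -(1/(beta*N)) ln Z with H(v) = N - ln delta(Av, 0), beta = ln q."""
    N = len(A[0])
    beta = math.log(q)
    # exp(-beta*H) = exp(-beta*N) * delta(Av, 0)**beta, and delta**beta = delta,
    # so only kernel vectors contribute to the partition function.
    Z = sum(math.exp(-beta * N)
            for v in product(range(q), repeat=N) if in_kernel(A, v, q))
    return -math.log(Z) / (beta * N)


# Hypothetical example over GF(3) with rank 2 and N = 4.
A = [[1, 2, 0, 1],
     [0, 1, 1, 2],
     [1, 0, 1, 0]]
q = 3
N = len(A[0])
omega = sum(in_kernel(A, v, q) for v in product(range(q), repeat=N))
d = round(math.log(omega, q))  # kernel dimension d(A) = log_q(Omega)
f = free_energy(A, q)
```

For this matrix `f` coincides with `1 - d / N`, which is also the rank density $r(A)/N$.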

We solved the saddle point equations by means of population dynamics for three different cases, in all of which we keep $K$ fixed:

1. Regular matrices - $C$ and $K$ fixed;

2. Fixed $K$ and $C$ drawn from a multinomial uniform probability

$$ P(C) = \frac{(MK)!}{\prod_{j=1}^{N}C_j!}\frac{1}{N^{MK}}; \tag{25} $$

3. Fixed $K$ while the $C_j$ values are drawn from a Poisson distribution of mean $\Lambda/N = \lambda K$, for each column separately, until the limit of $MK$ non-zero elements is reached.
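The first two column profiles in the list above can be sampled with a few lines of Python. This is a minimal sketch: the multinomial case is realized by throwing $MK$ non-zero elements uniformly at random into $N$ columns, which is one standard way of drawing from a uniform multinomial. The Poisson case is left out because the exact truncation rule ("until the limit of $MK$ non-zero elements is reached") would have to be guessed. Function names and parameter values are hypothetical.

```python
import random


def regular_profile(N, C):
    """Case 1: every column has exactly C non-zero elements."""
    return [C] * N


def multinomial_profile(N, M, K, rng):
    """Case 2: throw M*K non-zero elements uniformly at random into N columns."""
    counts = [0] * N
    for _ in range(M * K):
        counts[rng.randrange(N)] += 1
    return counts


rng = random.Random(0)
N, M, K = 12, 6, 4
c_reg = regular_profile(N, M * K // N)   # here C = M*K/N = 2
c_multi = multinomial_profile(N, M, K, rng)
```

Both profiles satisfy the constraint that the column counts sum to $MK$, the total number of non-zero elements fixed by the rows.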

Results for the various cases are presented in Fig. 1. The top left plot shows the theoretical thermodynamically dominant solutions, i.e., those having the lower free energy (paramagnetic in the range $0 < \lambda < 1$ and ferromagnetic for $\lambda > 1$).

The top right plot shows the results for the regular case (i). Solutions were obtained numerically by iterating the saddle point equations for the case of $q = 4$ and $K = 200$; $C$ was varied from 2 to 250. Repeating the calculations for different values of $q$ and $K$ has produced similar results. We see that the stable solution is always paramagnetic, but becomes unphysical beyond $\lambda = 1$, once the entropy, and consequently the dimension of the kernel, becomes negative.

FIG. 1: Average kernel dimension density (continuous lines) and average rank density (dashed lines) calculated as solutions to the replica symmetric saddle point equations. The top left plot shows the thermodynamically favored solution (paramagnetic for $0 < \lambda < 1$ and ferromagnetic for $\lambda > 1$). The top right shows the regular case (i) for fixed $K$ and $C$. Cases (ii) and (iii) are presented at the bottom left and right, respectively. Note that numerical instabilities occur for specific $\lambda$ values.

In the case of parity-check codes, this result means that the typical parity-check matrix defines a code of rate exactly $(N - M)/N$. This is assumed for any parity-check matrix in most calculations in the literature and is confirmed by our results to be true on average; however, it is important to point out that the result holds in the limit of large matrices and is likely to have finite-size corrections, which may affect practical applications.

Cases (ii) and (iii) are presented, respectively, at the bottom left and right of Fig. 1. Although these cases do not rigorously obey the constraint that each $C_j$ must be at most $M$, for large matrices and small values of $K$ (which is what happens in practice) $C_j$ is unlikely to exceed this value. However, instabilities can and indeed do occur for specific $\lambda$ values, presumably due to instances where $C_j$ takes higher values.

The bottom left plot shows results for case (ii), with $q = 3$, $K = 4$, $N = 1000$ and $1 < M < 1250$. In this case too, the stable dominant solution is paramagnetic. Numerical instabilities, which disappear slowly as the number of fields and steps in the population dynamics increases, emerge in the unphysical region and are shown in the figure.

The behavior for case (iii) is a little more complex due to the nature of the distribution chosen. Using the average value $\lambda K$ for the variables $C_j$ implies that, as $\lambda$ varies, their average value also changes. The plot shown was obtained for $q = 2$, $K = 4$, $N = 250$ and $1 < M < 300$. There are clearly special points in this plot, which distinguish it from the previous cases. The first point separates $\lambda$ values which give rise to average connectivity values lower and higher than 1 (left and right, respectively). Up to this point, the matrix has too many zero columns, pushing the kernel size to cover the full space of vectors. The other two points are where numerical instabilities emerge. Further calculations with different $K$ values indicate that these points appear around the extremes of the interval $2/K < \lambda < 3/K$. Inside this interval, the average value of the $C_j$'s is equal to 2 (once we take it to be an integer). This value marks the percolation transition for binary matrices. Apart from these differences, the resulting curve seems to coincide with those obtained for the previous cases.

The solution of the kernel size problem is mathematically equivalent to the solution of LDPC codes in channels with infinite noise. As the solution in the latter is paramagnetic, we are led to speculate that it is also the dominant solution here, up to the point where the quantity $s$, analogous to the entropy, becomes negative. From this point on, the solution becomes ferromagnetic. The numerical results seem to support this conjecture, although more careful calculations, varying all the parameters involved, must be carried out to confirm this hypothesis more generally.

V. NUMBER OF MATRICES



The number of GF(q) matrices with a given connectivity profile is of significant interest within the discrete mathematics community. Exact results have been obtained for the case of finite binary matrices [16], in the form of a formula that facilitates the calculation of their precise number. In this paper we analyze the case of large GF(q) matrices and provide an expression for both their exact and average number. Given the precise number of non-zero elements per row, $K = (K_1, \ldots, K_M)$, and per column, $C = (C_1, \ldots, C_N)$, one can write the number of matrices as

$$ \mathcal{N}_\Lambda = \sum_{A}\prod_{i=1}^{M}\delta\Big(\sum_{j=1}^{N}\chi(A_{ij}), K_i\Big)\prod_{j=1}^{N}\delta\Big(\sum_{i=1}^{M}\chi(A_{ij}), C_j\Big). \tag{26} $$

Note that we are summing directly over the entries of the matrix instead of introducing a connectivity tensor. In this way, the calculations are similar to the ones for obtaining the kernel dimension, with the details given in appendix C. The final result is



$$ \mathcal{N}_\Lambda = (q-1)^\Lambda\frac{\Lambda!}{\prod_{i=1}^{M}K_i!\prod_{j=1}^{N}C_j!}. \tag{27} $$

Note that the fraction on the right represents the number of binary matrices with the given profile of non-zero elements. The factor $(q-1)^\Lambda$ is the multiplicity of the non-zero entries, each of which can take any non-zero value in the Galois field.
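For very small matrices the closed form of equation (27), read here as $(q-1)^\Lambda \Lambda!/(\prod_i K_i! \prod_j C_j!)$, can be compared with an exhaustive count. The sketch below enumerates every $M \times N$ matrix with entries in $\{0, \ldots, q-1\}$ and counts those with the prescribed numbers of non-zero elements per row and column. The function names and the two test profiles are hypothetical, and the comparison is only meant to illustrate that the formula is a large, sparse matrix result rather than an identity valid at every size.

```python
from itertools import product
from math import factorial, prod


def count_exact(K, C, q):
    """Enumerate all M x N matrices over {0..q-1}; count those matching (K, C)."""
    M, N = len(K), len(C)
    count = 0
    for flat in product(range(q), repeat=M * N):
        rows = [flat[i * N:(i + 1) * N] for i in range(M)]
        row_ok = all(sum(x != 0 for x in rows[i]) == K[i] for i in range(M))
        col_ok = all(sum(rows[i][j] != 0 for i in range(M)) == C[j]
                     for j in range(N))
        count += row_ok and col_ok
    return count


def count_formula(K, C, q):
    """(q-1)^Lambda * Lambda! / (prod_i K_i! * prod_j C_j!), Lambda = sum(K)."""
    lam = sum(K)
    denom = prod(factorial(k) for k in K) * prod(factorial(c) for c in C)
    return (q - 1) ** lam * factorial(lam) / denom


# One non-zero element per row and column, q = 3.
exact_sparse = count_exact([1, 1, 1], [1, 1, 1], 3)
formula_sparse = count_formula([1, 1, 1], [1, 1, 1], 3)

# Two non-zero elements per row and column in a 3 x 3 binary matrix.
exact_dense = count_exact([2, 2, 2], [2, 2, 2], 2)
formula_dense = count_formula([2, 2, 2], [2, 2, 2], 2)
```

In the first profile the two counts agree, while in the dense $3 \times 3$ binary case the formula overestimates the enumeration, as one would expect from an expression derived for large matrices.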

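If we consider a distribution $P(K, C, \Lambda)$, we can look at the average number of matrices

$$ \langle\mathcal{N}_\Lambda\rangle = \left\langle(q-1)^\Lambda\frac{\Lambda!}{\prod_i K_i!\prod_j C_j!}\right\rangle_{K,C,\Lambda}. \tag{28} $$

Note that we can write the joint probability distribution as

$$ P(K, C, \Lambda) = P(K|\Lambda, C)P(\Lambda|C)P(C), \tag{29} $$

and that $P(\Lambda|C) = \delta\big(\Lambda, \sum_j C_j\big)$. Therefore, we have obtained for the average number of matrices

$$ \langle\mathcal{N}_\Lambda\rangle = \sum_{K}\sum_{C}P(K|C)P(C)\,(q-1)^{\sum_j C_j}\frac{\big(\sum_j C_j\big)!}{\prod_i K_i!\prod_j C_j!}, \tag{30} $$

where the distribution $P(K|C)$ includes the constraint $\delta\big(\sum_i K_i, \sum_j C_j\big)$.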

A simple calculation shows that for the regular case, where all $C_j$'s and $K_i$'s are fixed (to $C$ and $K$, respectively), and $q = 2$, the number of matrices scales as $N^{CN}$. Therefore, a more appropriate quantity to calculate, instead of the average number of matrices, is the quenched entropy

$$ S = \frac{1}{N}\langle\ln\mathcal{N}_\Lambda\rangle = \frac{1}{N}\sum_{K}\sum_{C}P(K|C)P(C)\ln\left[(q-1)^{\Lambda}\frac{\Lambda!}{\prod_i K_i!\prod_j C_j!}\right], \tag{31} $$

with $\Lambda = \sum_j C_j$, which scales as $\ln N$.

We analyze the behavior of this quantity for three different cases. We choose the $C_j$ to be i.i.d. and $K$ to be drawn from a multinomial distribution

$$ P(K|C) = \frac{\Lambda!}{\prod_{i=1}^{M}K_i!}\frac{1}{M^{\Lambda}}, \qquad \Lambda = \sum_j C_j, \tag{32} $$

for each realization of $C$. The three probability distributions for the variables $C_j$ to be analyzed are

1. uniform in the interval $[0, 2C]$

$$ P(C_j) = \frac{1}{2C+1}; \tag{33} $$

2. binomial in the interval $[0, M]$

$$ P(C_j) = \binom{M}{C_j}\left(\frac{C}{M}\right)^{C_j}\left(1-\frac{C}{M}\right)^{M-C_j}; \tag{34} $$

3. Zipf distribution for $C_j = 1, \ldots, M$

$$ P(C_j) = \frac{C_j^{-s}}{\sum_{k=1}^{M}k^{-s}}, \tag{35} $$

where $C$ is the mean of the distributions. The motivation for choosing these connectivity profiles is that they appear to be the most commonly analyzed and feature (especially the last one) in recent analysis and modeling of networks.

TABLE I: Asymptotic values of $S^*$ for large $\lambda$

C     Asymptotic value
5     29.66
10    60.73
20    123.20

Results for the binomial (dashed line) and uniform (dotted line) distributions with means $C = 5.0, 10.0, 20.0$, $q = 2$ and $N = 300$ are plotted in Fig. 2, together with the value $S^*$ of $S$ obtained for constant $C_j = C$ and $K_i = C/\lambda$ for all $i$ and $j$. This function is explicitly given by

$$ S^* = C\ln(q-1) - \ln C! + \frac{1}{N}\ln(NC)! - \lambda\ln(C/\lambda)!, \tag{36} $$

and we can obtain its asymptotic behavior for small and large $\lambda$ as

$$ \lambda \ll 1 \;\Rightarrow\; S^* = C\ln(q-1) - \ln C! + C\ln\lambda N, \tag{37} $$
$$ \lambda \gg 1 \;\Rightarrow\; S^* = C\ln(q-1) - \ln C! + C\ln CN + (\gamma - 1)C, \tag{38} $$

where $\gamma \approx 0.577216$ is the Euler-Mascheroni constant. Asymptotic limits for large $\lambda$ are given in Table I.
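The expression for $S^*$ and its two limits are easy to evaluate numerically. The sketch below reads equation (36) as $S^* = C\ln(q-1) - \ln C! + \frac{1}{N}\ln(NC)! - \lambda\ln(C/\lambda)!$, with the factorials continued to non-integer arguments through the log-gamma function, and compares it with the small and large $\lambda$ forms of equations (37) and (38). The parameters $q = 2$ and $N = 300$ are those quoted for Fig. 2; the function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def s_star(C, lam, q, N):
    """Equation (36); x! is evaluated as exp(lgamma(x + 1))."""
    return (C * math.log(q - 1) - math.lgamma(C + 1)
            + math.lgamma(N * C + 1) / N
            - lam * math.lgamma(C / lam + 1))


def s_star_small_lambda(C, lam, q, N):
    """Equation (37), small-lambda limit."""
    return C * math.log(q - 1) - math.lgamma(C + 1) + C * math.log(lam * N)


def s_star_large_lambda(C, q, N):
    """Equation (38), large-lambda limit."""
    return (C * math.log(q - 1) - math.lgamma(C + 1)
            + C * math.log(C * N) + (EULER_GAMMA - 1) * C)


# Large-lambda asymptotic values for the parameters of Fig. 2.
table = {C: s_star_large_lambda(C, 2, 300) for C in (5, 10, 20)}
```

The entries of `table` can be compared with Table I, and `s_star(C, lam, 2, 300)` for very large or very small `lam` can be compared with the corresponding limit; small residual differences come from the Stirling approximation used in the asymptotic forms.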

For large $\lambda$ values, the result for constant $C$ and $K$ upper-bounds the other two distributions. Additional calculations seem to indicate that this is the case for any distribution, although a proof of this conjecture is still sought. This implies that if we keep the number of columns constant and increase the ratio $\lambda$ by adding rows, then whenever the number of rows is much larger than the number of columns, the average number of matrices becomes independent of both the ratio and the number of rows. The plots also suggest that the average number of matrices in these cases is essentially determined by the average value of the $C$ distributions.

For small values of $\lambda$, the uniform distribution continues to be upper-bounded by the constant distribution. The binomial distribution, however, is higher in a small interval around zero. This behavior is shown in the inset, where lower $C$ values give rise to higher $S$ as $\lambda$ becomes smaller.

Figure 3 shows the results for the Zipf distribution with different values of the power $s$, compared with a uniform distribution in the range $[0, M]$. In this case, the mean of the distributions varies with $\lambda$. We see that, although the average value of the Zipf distributions increasingly differs from the uniform value $M/2$ as $s$ increases, the average number of matrices actually becomes highly similar.



VI. CONCLUSIONS 



We have introduced a new mapping of Galois matrices to spin systems based on the group homomorphism between GF(q) under addition mod $q$ (denoted by $\oplus$) and the complex $q$-th roots of unity. In addition, we have introduced a different way of summing over random matrices that can be generalized to include any kind of connectivity constraint and is conceptually cleaner and simpler than the existing approaches. The new mapping and the alternative summation over random matrices allow for a factorization of the constraints, which simplifies calculations of the kernel and the number of matrices under various connectivity profiles.

Using the replica approach and these newly introduced techniques, we calculated the average dimension of the kernel for a general distribution of non-zero entries and solved the resulting equations numerically, finding that the average kernel density is $1 - M/N$ in all cases studied. We conjecture that this result is always valid. Based on the analogy with thermodynamic quantities corresponding to free energy, internal energy and Hamiltonian, we showed that the replica symmetric ansatz in this case must be exact. With the same techniques, we were also able to find the total number of large matrices for fixed $K$ and $C$ and their average number, which was then computed for different distributions of theoretical and practical relevance.

FIG. 2: Values of the quenched entropy $S$ versus $\lambda$ for the different distributions and various $C$ values ($C = 5, 10, 20$), with multinomial $K$: constant (continuous line), binomial (dashed line) and uniform (dotted line). The inset shows in detail the small $\lambda$ regime, where just the binomial and constant distributions are represented. The higher lines on the right correspond to the higher $C$ values.

FIG. 3: Values of $S$ versus $\lambda$ for the uniform distribution (dashed line) and the Zipf distribution (continuous lines) for $s = 1, 3, 4, 10$, respectively, from bottom to top.

The results presented have practical relevance in a number of areas, including coding, network modeling and some biological models. With respect to LDPC codes, the average kernel density result implies that randomly generated LDPC codes typically define codes of rate exactly $1 - M/N$, an assumption which is generally made but lacks rigorous derivation. Also, as the parity-check matrix can represent the connectivities in graphs (see [17]), the results obtained for the average number of matrices provide a principled approach to determining the average number of possible graphs with a given connectivity distribution of a more general nature than the connectivity profiles examined in this paper.

APPENDIX A: PROOF OF A(q) = q
In this appendix we prove the statement made in section III that $A(q) = q$, where

$$ A(q) = \prod_{m=1}^{q-1}\left[1 - e^{2\pi i m/q}\right]. \tag{A1} $$

Let us use the notation

$$ z(m) = e^{2\pi i m/q}, \tag{A2} $$

and, noting that the complex roots of unity appear in complex conjugate pairs, we write

$$ A(q) = \begin{cases}\prod_{m=1}^{(q-1)/2}[1 - z(m)][1 - \bar z(m)], & q \text{ odd},\\ 2\prod_{m=1}^{(q-2)/2}[1 - z(m)][1 - \bar z(m)], & q \text{ even},\end{cases} \tag{A3} $$

where the bar indicates a complex conjugate and, for even $q$, the factor 2 is the contribution of the root $z(q/2) = -1$. Using

$$ [1 - z(m)][1 - \bar z(m)] = 2 - 2\cos\frac{2\pi m}{q} = 4\sin^2\frac{\pi m}{q}, \tag{A4} $$

equation (A3) becomes

$$ A(q) = \begin{cases} 2^{q-1}\prod_{m=1}^{(q-1)/2}\sin^2\frac{\pi m}{q}, & q\text{ odd},\\ 2^{q-1}\prod_{m=1}^{(q-2)/2}\sin^2\frac{\pi m}{q}, & q\text{ even}.\end{cases} \tag{A5} $$

As the sine function is positive in the interval $(0, \pi)$, $\sin(\pi m/q) = \sin(\pi(q-m)/q)$ and $\sin(\pi/2) = 1$, we can write, for any $q$,

$$ A(q) = 2^{q-1}\prod_{m=1}^{q-1}\sin\frac{\pi m}{q}. \tag{A6} $$

Using the known identity [16]

$$ \sin(qx) = 2^{q-1}\prod_{m=0}^{q-1}\sin\left(x + \frac{m\pi}{q}\right), \tag{A7} $$

divided by $\sin x$ and taking $x \to 0$, one obtains

$$ q = 2^{q-1}\prod_{m=1}^{q-1}\sin\frac{m\pi}{q}, \tag{A8} $$

which, by substituting into equation (A6), gives the desired result $A(q) = q$.

APPENDIX B: REPLICA SYMMETRIC SADDLE POINT EQUATIONS



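Using integral representations for the first two sets of Kronecker delta functions, we can write the averaged replicated kernel size defined in equation (14) as

where $\otimes$ and $\oplus$ indicate multiplication and summation on GF(q), respectively, and

Using the representation of the parity-check constraint given in equation (6), the product over replica indices of the delta function can be written as

$$ \prod_{a=1}^{n}\delta\Big(\bigoplus_{j}A_{ij}\otimes v_j^a, 0\Big) = \frac{1}{q^n}\prod_{a=1}^{n}\Big[1 + \sum_{s=1}^{q-1}F_i(s,a)G(s)\Big] = \frac{1}{q^n}\sum_{r=0}^{n}\sum_{\langle a_1\cdots a_r\rangle}\sum_{s_1,\ldots,s_r}G(s_1)\cdots G(s_r)F_i(s_1,a_1)\cdots F_i(s_r,a_r), \tag{B3} $$

where $\langle a_1\cdots a_r\rangle$ runs over subsets of $r$ distinct replica indices, with

$$ G(s) = (-1)^s\sum_{\langle m_1\cdots m_s\rangle}\exp\left(\frac{2\pi i}{q}m_1\right)\cdots\exp\left(\frac{2\pi i}{q}m_s\right), \tag{B4} $$

the sum running over subsets of $s$ distinct elements of $\{1, \ldots, q-1\}$, and

$$ F_i(s,a) = \exp\left[\frac{2\pi i}{q}s\,(A_{i1}\otimes v_1^a)\right]\cdots\exp\left[\frac{2\pi i}{q}s\,(A_{iN}\otimes v_N^a)\right] = \prod_{j=1}^{N}\gamma_j(s,a,A_{ij}), $$

where we defined, for simplicity,

$$ \gamma_j(s,a,A_{ij}) = \exp\left[\frac{2\pi i}{q}s\,(A_{ij}\otimes v_j^a)\right]. \tag{B5} $$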
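We can now write the partition function as

$$ Z_n = \frac{1}{\mathcal{N}}\left\langle\sum_{\{v^a\}}\oint DW\,DZ\prod_{i=1}^{M}\frac{1}{q^n}\sum_{r=0}^{n}\sum_{\langle a_1\cdots a_r\rangle}\sum_{s_1,\ldots,s_r}G(s_1)\cdots G(s_r)\,\Gamma_i\right\rangle_{K,C,\Lambda}, \tag{B6} $$

$$ DW = \prod_{i=1}^{M}\frac{dW_i}{2\pi i}\frac{1}{W_i^{K_i+1}},\qquad DZ = \prod_{j=1}^{N}\frac{dZ_j}{2\pi i}\frac{1}{Z_j^{C_j+1}}, \tag{B7} $$

where

$$ \Gamma_i = \sum_{A_{i1},\ldots,A_{iN}}\prod_{j=1}^{N}P(A_{ij})(W_iZ_j)^{\chi(A_{ij})}\gamma_j(s_1,a_1,A_{ij})\cdots\gamma_j(s_r,a_r,A_{ij}) $$
$$ \quad = \prod_{j=1}^{N}\sum_{A_{ij}}P(A_{ij})(W_iZ_j)^{\chi(A_{ij})}\gamma_j(s_1,a_1,A_{ij})\cdots\gamma_j(s_r,a_r,A_{ij}) $$
$$ \quad = p^N\prod_{j=1}^{N}\Big[1 + \frac{1}{p}\sum_{h=1}^{q-1}P(A_{ij}=h)\,W_iZ_j\,\gamma_j(s_1,a_1,h)\cdots\gamma_j(s_r,a_r,h)\Big], \tag{B8} $$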

where we define, for convenience, $p = P(A_{ij} = 0)$. Let us define a probability distribution over the values of $h$ as

$$ P(h) = \frac{P(A_{ij} = h)}{1 - p}, \tag{B9} $$

in such a way that $h$ varies from 1 to $q - 1$ and the probability over this range is correctly normalized. Then

$$ \Gamma_i = p^N\prod_{j=1}^{N}\Big[1 + \frac{1-p}{p}W_iZ_j\langle\gamma_j(s_1,a_1,h)\cdots\gamma_j(s_r,a_r,h)\rangle_h\Big] $$
$$ \quad = p^N\sum_{l=0}^{N}\Big(\frac{1-p}{p}\Big)^{l}W_i^{l}\sum_{\langle j_1\cdots j_l\rangle}Z_{j_1}\cdots Z_{j_l}\,\langle\gamma_{j_1}(s_1,a_1,h)\cdots\gamma_{j_1}(s_r,a_r,h)\rangle_h\cdots\langle\gamma_{j_l}(s_1,a_1,h)\cdots\gamma_{j_l}(s_r,a_r,h)\rangle_h. \tag{B10} $$
The integrals over the $W_i$'s, acting on the $\Gamma_i$'s, select the power of $W_i$ to be $K_i$ and we therefore obtain

(B11)

where

(B12)

The calculation of $\mathcal{N}$ is similar to the calculation of the number of matrices shown in appendix C and we end up with

(B13)

which involves exactly the number of binary matrices ($q = 2$) as calculated in appendix C. Introducing the replica overlaps

$$ Q^{s_1\cdots s_r}_{\langle a_1\cdots a_r\rangle} = \frac{1}{N}\sum_{j=1}^{N}Z_j\langle\gamma_j(s_1,a_1,h)\cdots\gamma_j(s_r,a_r,h)\rangle_h \tag{B14} $$

and the corresponding auxiliary variables $\hat{Q}^{s_1\cdots s_r}_{\langle a_1\cdots a_r\rangle}$ by means of Dirac delta functions, we can express the partition function as

(B15)

where

(B16)

and the summations run over all the allowed values of $r$, $\langle a_1\cdots a_r\rangle$ and $s_1, \ldots, s_r$.
Under the assumption of replica symmetry in the form

(B17)
(B18)

where the averages over $x$ and $\hat{x}$ are taken with respect to the field distributions $\pi(x)$ and $\hat{\pi}(\hat{x})$ respectively, we can show by straightforward algebraic manipulations that

$$ \sum Q^{s_1\cdots s_r}_{\langle a_1\cdots a_r\rangle}\hat Q^{s_1\cdots s_r}_{\langle a_1\cdots a_r\rangle} = Q_0\hat Q_0\big\langle[1 + (q-1)x\hat x]^n\big\rangle_{x,\hat x}, \tag{B19} $$

where it is easy to see that

$$ \sum_{s}G(s) = A(q) - 1 = q - 1, \tag{B20} $$

and

(B21)
(B22)

with

$$ \omega(v, h_l) = \sum_{s=1}^{q-1}\exp\left[i\frac{2\pi s}{q}(h_l\otimes v)\right] = \begin{cases} q - 1, & \text{if } h_l\otimes v = 0,\\ -1, & \text{otherwise}.\end{cases} \tag{B23} $$

We can simplify the last equation by noting that

$$ \sum_{v=0}^{q-1}\prod_{l=1}^{C_j}[1 + \omega(v,h_l)\hat x_l] = \prod_{l=1}^{C_j}[1 + (q-1)\hat x_l] + (q-1)\prod_{l=1}^{C_j}(1 - \hat x_l). \tag{B24} $$

Let us write

$$ Z_n = \int DQ\,D\hat Q\,e^{NS}, \tag{B25} $$

with

(B26)

where

(B27)

Let us define $a = NQ_0\hat Q_0$. For $n \ll 1$, we can consider only the leading contributions in the number of replicas, which gives

(B28)

with

(B29)

Substituting the above formulas in $S$ for $n \to 0$, the extremization with respect to $Q_0$, $\hat Q_0$, $\pi(x)$ and $\hat\pi(\hat x)$ leads to the saddle point equations of section IV.



APPENDIX C: NUMBER OF MATRICES 



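Here we give the detailed calculation of the number of $M \times N$ matrices over GF(q) for large $M$ and $N$. Repeating the formula given in section V, we have

$$ \mathcal{N}_\Lambda = \sum_{A}\prod_{i=1}^{M}\delta\Big(\sum_{j=1}^{N}\chi(A_{ij}), K_i\Big)\prod_{j=1}^{N}\delta\Big(\sum_{i=1}^{M}\chi(A_{ij}), C_j\Big), \tag{C1} $$

with $\chi(A_{ij}) = 0$ if $A_{ij} = 0$ and 1 otherwise. Following a similar procedure as in appendix B, we use the integral representations of the Kronecker delta functions to write it as

$$ \mathcal{N}_\Lambda = \oint DW\,DZ\prod_{i,j}\sum_{A_{ij}}(W_iZ_j)^{\chi(A_{ij})} $$
$$ \quad = \oint DW\,DZ\prod_{i,j}[1 + (q-1)W_iZ_j] $$
$$ \quad = \oint DW\,DZ\prod_{i=1}^{M}\Big[1 + \sum_{r=1}^{N}(q-1)^rW_i^r\sum_{\langle j_1\cdots j_r\rangle}Z_{j_1}\cdots Z_{j_r}\Big] $$
$$ \quad = \oint DW\,DZ\Big[1 + \sum_{s=1}^{M}\sum_{\langle i_1\cdots i_s\rangle}\sum_{r_1,\ldots,r_s}(q-1)^{r_1+\cdots+r_s}W_{i_1}^{r_1}\cdots W_{i_s}^{r_s}F(r_1,Z)\cdots F(r_s,Z)\Big], \tag{C2} $$

where

$$ F(r,Z) = \sum_{\langle j_1\cdots j_r\rangle}Z_{j_1}\cdots Z_{j_r}. \tag{C3} $$

The integrals over the $W$'s can pass through the summations and will factorize to give the corresponding Kronecker delta functions, resulting in

$$ \mathcal{N}_\Lambda = (q-1)^{K_1+\cdots+K_M}\oint DZ\,F(K_1,Z)\cdots F(K_M,Z) $$
$$ \quad = (q-1)^{\Lambda}\oint DZ\,F(K_1,Z)\cdots F(K_M,Z) $$
$$ \quad = (q-1)^{\Lambda}\oint DZ\prod_{i=1}^{M}\sum_{\langle j_1\cdots j_{K_i}\rangle}Z_{j_1}\cdots Z_{j_{K_i}} $$
$$ \quad \simeq (q-1)^{\Lambda}\oint DZ\prod_{i=1}^{M}\frac{1}{K_i!}\Big(\sum_{j=1}^{N}Z_j\Big)^{K_i} $$
$$ \quad = \frac{(q-1)^{\Lambda}}{\prod_iK_i!}\oint DZ\Big(\sum_{j=1}^{N}Z_j\Big)^{\Lambda} $$
$$ \quad = \frac{(q-1)^{\Lambda}}{\prod_iK_i!}\binom{\Lambda}{C_1}\binom{\Lambda-C_1}{C_2}\cdots\oint DZ\,Z_1^{C_1}\cdots Z_N^{C_N}, \tag{C4} $$

which gives the final result

$$ \mathcal{N}_\Lambda = \frac{(q-1)^{\Lambda}\Lambda!}{\prod_iK_i!\prod_jC_j!}. \tag{C5} $$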



APPENDIX D: PROOF OF REPLICA SYMMETRY 



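Using the fact that the random matrices can be seen as statistical physics systems with Hamiltonian $H(v) = N - \ln\delta(Av, 0)$, we now prove that the replica symmetric solution is the exact one. In fact, the form of the Hamiltonian implies that

$$ P(v) = \frac{\delta(Av, 0)}{\sum_{v'}\delta(Av', 0)} = q^{-d(A)}\delta(Av, 0). \tag{D1} $$

The distribution of the overlaps of the spins is given by

$$ P(\rho) = \left\langle\delta\Big(\rho - \frac{1}{N}\sum_{j=1}^{N}\sigma_j\sigma'_j\Big)\right\rangle_{\sigma,\sigma'} = q^{-2d(A)}\sum_{v,v'}\delta(Av, 0)\,\delta(Av', 0)\,\delta\Big(\rho - \frac{1}{N}\sum_{j=1}^{N}\exp\Big[\frac{2\pi i}{q}(v_j + v'_j)\Big]\Big). \tag{D2} $$

Let us call

$$ g(v, v') = \delta\Big(\rho - \frac{1}{N}\sum_{j=1}^{N}\exp\Big[\frac{2\pi i}{q}(v_j + v'_j)\Big]\Big) \tag{D3} $$

and note that $g(v, v') = g(0, v\oplus v')$. Therefore we can write

$$ P(\rho) = q^{-2d(A)}\sum_{v,v'}\delta(Av, 0)\,\delta(Av', 0)\,g(0, v\oplus v') $$
$$ \quad = q^{-2d(A)}\sum_{v,v'}\delta(Av, 0)\,\delta(Av', 0)\sum_{u}\delta(u, v\oplus v')\,g(0, u) $$
$$ \quad = q^{-2d(A)}\sum_{u}g(0, u)\sum_{v}\delta(Av, 0)\sum_{v'}\delta(Av', 0)\,\delta(u, v\oplus v') $$
$$ \quad = q^{-2d(A)}\sum_{u}g(0, u)\sum_{v}\delta(Av, 0)\,\delta(A(u\oplus(-v)), 0) $$
$$ \quad = q^{-d(A)}\sum_{u}\delta(Au, 0)\,g(0, u), \tag{D4} $$

where the last step uses the fact that, for $Av = 0$, $\delta(A(u\oplus(-v)), 0) = \delta(Au, 0)$, together with $\sum_v\delta(Av, 0) = q^{d(A)}$.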



Therefore, the distribution of the overlaps is the same as the distribution of the magnetization in the spin system. This implies that there is no spin glass phase in the system and, therefore, no replica symmetry breaking. The above calculation can also be viewed as a consequence of the gauge invariance of the Hamiltonian with respect to the transformation $v \to v\oplus v'$, where $Av' = 0$, which leads to essentially the same calculation as above.
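The equality between the overlap and magnetization distributions can be checked by enumeration on a small example. The sketch below assumes prime $q$ and takes the overlap of two kernel vectors $v$ and $v'$ to be $\frac{1}{N}\sum_j \sigma(v_j)\sigma(v'_j)$, which by the homomorphism property equals the magnetization of $v \oplus v'$. It then counts how often each value occurs over all kernel pairs and over single kernel vectors. The function names and the example matrix are hypothetical.

```python
import cmath
from collections import Counter
from itertools import product


def kernel(A, q):
    """All vectors v in Z_q^N with A v = 0 (mod q), by enumeration."""
    N = len(A[0])
    return [v for v in product(range(q), repeat=N)
            if all(sum(a * x for a, x in zip(row, v)) % q == 0 for row in A)]


def spin_sum(v, q):
    """(1/N) * sum_j exp(2*pi*i*v_j/q), rounded so equal values compare equal."""
    z = sum(cmath.exp(2j * cmath.pi * x / q) for x in v) / len(v)
    return (round(z.real, 9), round(z.imag, 9))


# Hypothetical example over GF(3).
A = [[1, 2, 0, 1],
     [0, 1, 1, 2]]
q = 3
ker = kernel(A, q)

# Histogram of magnetizations over single kernel vectors.
magnetization = Counter(spin_sum(u, q) for u in ker)

# Histogram of overlaps over ordered pairs of kernel vectors,
# using sigma(v_j) * sigma(w_j) = sigma(v_j + w_j mod q).
overlap = Counter(
    spin_sum(tuple((a + b) % q for a, b in zip(v, w)), q)
    for v in ker for w in ker
)
```

Each overlap value appears exactly `len(ker)` times as often as the corresponding magnetization value, so the two normalized distributions coincide, reflecting the fact that the kernel is closed under $\oplus$.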